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Abstract 

This paper deals with the filtering problem for a class of discrete 
time stochastic volatility models in which the disturbances have ratio- 
nal probabihty density functions. This includes the Cauchy distribu- 
tions and Student t-distributions with odd number of degrees of free- 
dom. Using state space realizations to represent the rational probabil- 
ity density functions we are able to solve the filtering problem exactly. 
However the size of the involved state space matrices grows exponen- 
tially with each time step of the filter. Therefore we use stochastically 
balanced truncation techniques to approximate the high order rational 
functions involved. In a simulation study we show the applicability of 
this approach. In addition a simple method of moments estimator is 
derived. 

Keywords: stochastic volatility, filtering, rational probability density func- 
tion, state space realization, stochastically balanced truncation. 
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Introduction 



In the area of financial time series the Black-Scholes model is often used 
for modelling the behaviour of the price of stocks, exchange rates and other 
financial time series. This is also the basis for much of the literature on 
pricing of derivative financial instruments such as options. However it is 
considered to be a well-known fact that although the volatility is assumed 
to be constant in the Black-Scholes model, in practice it is varying. This 
has led to the investigation of more general models in which the volatility 
is allowed to vary. One can broadly distinguish between two types of gener- 
alizations. One is the type of model in which the volatility is varying over 
time and its dynamic behaviour is described by some stochastic process. A 
problem with such models is that it is generally difficult to solve the volatil- 
ity estimation problem for such models: the calculation of the conditional 
density of the volatility at some point in time, given the observations up 
till that same point in time, is usually a difficult task for which there are 
no closed form expressions. In the literature there are several proposals to 
approximate the conditional density, cf. e.g. [11], [14], [1]. The other, sec- 
ond type of model that is used is the ARCH model and its generalizations 
([4], see also e.g. [7]), as applied to financial time series. These models have 
the advantage that the volatility is again time varying, and the conditional 
volatility (also called conditional hctcroskcdasticity in this context) is in fact 
prescribed by the model as a deterministic function of the past observations. 
By construction the problem of estimating the stochastic volatility has been 
solved in these models. However one could argue that this is at the expense 
of a less transparent model for the underlying data generating process. In 
the present paper a model of the first type will be presented, however with 
the advantage that for this model the volatility estimation problem can be 
solved, as we will show. Apart from the volatility to be time-varying an- 
other feature of financial time series that is often reported is that it has fat 
tails. In the literature there are many studies that try to deal with this phe- 
nomenon by specifying non-Gaussian disturbances. This goes back to the 
work of [15] who suggested to consider the class of stable distributions as 
possible distributions for the disturbances. An important example of stable 
distributions is given by the Cauchy distributions. More recent studies seem 
to favor other distributions, including Student t-distributions (cf e.g. [2] p. 
19, [13]). In the approach followed in the present paper all disturbances are 
allowed which have a rational probability density function on the real line. 
This includes the Cauchy distributions and Student t-distributions with odd 
number of degrees of freedom. In fact it is well-known that the Gaussian 
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distribution can be approximated by a Student t-distribution of sufficiently 
high number of degrees of freedom. Therefore in a sense the corresponding 
Gaussian model is a limiting case of the class of models presented here. It 
should perhaps be stressed from the start that there is a price to be paid in 
the form of high complexity if one wants to use rational densities of higher 
(McMillan) degree. From the point of view of complexity in fact the esti- 
mation problem is easiest when the disturbances have Cauchy density. In a 
previous paper a matrix calculus was developed for performing various cal- 
culations with rational probability density functions ([10]) and applied to a 
filtering problem for a class of linear dynamical models. Here we extend this 
calculus and show how it can be fruitfully applied to the non-linear filtering 
problem of volatility estimation, in a specific class of stochastic volatility 
models. 

The main extension of the calculus concerns a state-space formula for 
the composition of a proper rational function (which can be allowed to be a 
proper rational matrix function) with a proper rational function, under some 
minor condition that is required to ensure the resulting (rational) function 
is again proper. 

It is shown that the conditional probability density functions for the 
state are all rational functions in this model class and we provide an explicit 
way to calculate these and hence solve the filtering problem exactly. How- 
ever as the complexity of the resulting rational probability density functions 
increases very quickly over time, the exact filter cannot be implemented 
practically, except during a short period of time. An important innovation 
in this respect is the application of an approximation method stemming from 
stochastic systems theory, called the SBT (stochastically balanced trunca- 
tion) method. This method allows to find a lower order positive rational 
density function which differs at each point on the real line by at most a 
given prescribed percentage of the original rational density function. In an 
application we use a tolerance level of 2% giving excellent results. (The 
bound used is well-known in stochastic systems theory and is based on the 
deep and elegant theory of Hankel norm approximation) . In the implemen- 
tation of the filter one needs to switch between various representations of 
the rational probability density functions. Numerically reliable methods are 
presented to perform these steps. The possibility to implement the vari- 
ous theoretical ideas in a numerically stable way is crucial for the success 
of the practical implementation and forms one of the key contributions of 
this paper to the practical usage of rational probability density functions in 
filtering problems. We provide the results of some applications to simulated 
data and to empirical FX (foreign exchange) data and present a number 
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of conclusions. A number of the technical results used are collected in an 
appendix. 

1 The model class 

Stochastic volatility models that we will consider are of the following form: 

Xt+i = aXt + Wt , . 

Yt = ViXtWt ^ ' 

where for each t £ N = {1,2,...}, the random variables Xt,Wt,Yt,Ut take 
their values in the real numbers, and where V{x) is a real-valued, positive 
polynomial function of x G M; {Wt, t G N} and {Ut, t G N} are sequences 
of jointly stochastically independent real valued random disturbances with 
time-invariant probability density functions: for each t G N, Wt has ratio- 
nal probability density function pwiw), Ut has rational probability density 
function pu{u). The initial state Xi has rational probability density function 
PXi{x)- The parameter a is a real number that will be assumed to be un- 
equal to zero for ease of exposition. In financial applications, the Yj usually 
stand for the returns Yt = log{St+i/St) of some price process {St, t G N}. 
A number of remarks can be made about this model class. 

(i) The family of rational probabihty density functions is a very rich class. 
It contains the stable class of Cauchy densities, it contains the Student 
distributions with odd number of degrees of freedom. Under relatively 
mild conditions, probability density functions can be approximated by 
rational probability density functions, as follows from results of the 
theory of rational approximation. It is well-known that the Gaus- 
sian probability density functions can be approximated for example 
by the Student t-distribution of sufficiently high number of degrees of 
freedom, therefore the Gaussian case appears in a certain sense as a 
limiting case of our model class. 

(ii) The function F is a positive polynomial, i.e. for all x G M, V{x) > 0. 
Here this is required for technical reasons. In the literature one finds 
other positive functions as specifications for V as well, for example an 
exponential function V{x) = exp(^^) (cf. e.g. [20]). If desired one 
can approximate the exponential function on any given finite interval 
by a positive polynomial. Generalization of the results presented here, 
to the case in which F is a non-negative polynomial, i.e. for all x G 
M, V{x) > is straightforward. 
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(iii) The parameters in the model as well as in the rational probability den- 
sity functions of Wt and Vt, t = 1,2, . . . , are assumed to be constants 
here. However they could be taken time-varying if desired. The re- 
sulting filter equations for that case form a straightforward extension 
of the filter equations presented in this paper. 



2 The filter 

We consider the following nonlinear filtering problem: Estimate at each time 
t eN the volatility V{Xt) from the sequence of observations := {Yg, s G 
N, s < t}. (Note that we will use the same symbols Y^ := {1^, s G N, s < 
t} for the random variables and their observed values. This is to avoid 
complicating the notation any further. The interpretation of the symbols as 
random variables or observed values should be clear from the context). The 
solution of such a problem consists of finding for each t € N the conditional 
probability density function of Xt given Y^, and deriving the desired estimate 
of V{Xt) from this. Let the conditional density of some random variable Z 
given y/ be denoted by Pz\Y{- 

The filter consists of a set of recursive equations by which one can cal- 
culate the conditional probability density function of the state Xt given the 
observations Y"/. The filter consists of a prediction step and an update step. 
In the prediction step one calculates the conditional density Pxt+i\Y^ -'^t+i 
given the observations Y^ starting from the conditional density Pxt\Y* of Xt 
given y/ : 



PXt+i\Y{ =PaXt\Yl *PW- 

Here * denotes convolution. 

In the update step one calculates the conditional density of Xt given the 
observations Yj* from the observation Yf and the conditional density of Xt 
given y/~^, using Bayes' rule. Suppose the conditional probability density 
function Px^\y*-'^ °f -^t given Y^~^ is known and the observation Yt becomes 
available. The joint density of {Xt, Yt) can be obtained from the joint density 
of {Xt, Ut) by a change of variables: 

Xt\^( Xt 
Yt J V y{Xt)Ut 

The inverse Jacobian determinant of this change of variables is y^Xt), which 
is positive because F is a positive polynomial. It follows that the joint 



5 



density of {Xt,Yt) is given by p^^ y^^yt-i{x,y) = p^^^^^^yt-iix, vfx))v(x) = 
Pxt\Y*~'''(^^Pu(v{!i^^ vjx) ' Substituting y = Yt, we obtain the following ex- 
pression for the density of Xt\Yi : 

1 Y 1 

Px^w^i^) = ^Px,\Yri^^P^^y^)^v{^y 

where 

f-oo 2 

Since p-^^ y|yt-i(x, It) = ^'xt|K*(^)Pyt|y'-i(-^t) follows that 

Ct = J Px y |yt-i(a;, lt)dx = pyiyt-i(it). Therefore we may evaluate the 

likelihood as 

pyi,y2,...,yT(^i>^2, ...,Yt) = c\C2 ■■■ct. 

Note that the value of the normalization constants ci, C2, . . . , ct easily fol- 
lows from Proposition 3.1 without the need for explicit integration. 

As shown in [10] the convolution of two rational density functions is a 
rational function too. Therefore it follows easily that the conditional density 
functions defined above will all be rational, given our assumptions! A way 
to implement the filter using ideas from system theory will be presented in 
the next sections. 



3 State-space calculus for rational probability den- 
sity functions 

3.1 Introduction to the state-space calculus 

The key idea is to identify rational densities with spectra of linear, dynamic, 
continuous time, finite dimensional systems. This allows us to use concepts 
and methods from systems theory; for an overview cf, e.g, [19]. 

Consider a rational non-normalized probability density function p{x). 
With it we associate a rational function $(s) on the complex plane which is 
specified on the imaginary axis by 

p{x) = <^(ix), Vx G M. 

Note that <J>(-) is a rational function which is nonnegative and integrable 
on the imaginary axis. Such a function will be called an integrable spectral 
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density in this paper. The function $ has a representation as 
go + gis-\ |-£?2qS^^ 



-\ n> q; gk,fk e C 



(2) 



/0 + /lS+--- + /2nS2"' 

with coprime polynomials g{s) = go + gis + • • • + g2qS^'^ and f{s) = /o + 
fis + • • • + /2raS^"'- Since is strictly proper there exists a state space 
representation, i.e. a triple [F G 'C?'^'^'^'^ , G & C^^^^,^ € C^^^n]^ g^^.^^ ^-^^^ 

^{s) = H{sl2n-F)-^G. 

Note that we need complex valued triples as p may be non-symmetric. Fur- 
ther note that this representation is not unique. A state-space transforma- 
tion [F,G,H] ^ [TFT-'^,TG,HT-^], where T G C^^^^n ^ non-singular 
matrix, leads to a usually different state-space representation of the same 
function As a shorthand notation for such a state space realization we 
will write: 



TT 



" F 


G ' 


H 






(3) 



where vr will be used in general to denote the mapping that maps a par- 



titioned matrix 



" A 


B ' 


C 


D 



to the corresponding rational function C{sl — 



A)^^ B + D . It is assumed that the partitioning involved will be clear from 
the context in all cases. Clearly tt is invariant under state-space transfor- 
mation. 

Since $(ix) > holds for all x G M, there exists an additive as well as a 
multiplicative decomposition of of the form 

= Z{s) + Z*{s) = K{s)K*{s) 

where Z and K are strictly proper. The rational transfer function Z{s) is 
called a spectral summand and K{s) is a spectral factor. Here for a rational 
complex function G(s), G*{s) is defined as G*(s) = G(— s), where z denotes 
complex conjugation. In particular note that #*(s) = $(s) holds. Since 
$(s) has no poles on the imaginary axis, a stable summand Z{^s), and a 
stable factor K{s) may be chosen, i.e. K{s) and Z[s) have no pole in the 
closed right half plane. From now on we always impose stability on Z and 
K. 

Since Z, K are strictly proper rational functions, there exist state space 
representations: 



K = TT 



A 


B ' 


C 


_ 



K* =TT 



-A* 


C* 


-B* 
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TT 



A 


M ' 


C 






z* 



TT 



-A* 


C* ' 


_ -M* 






Here and elsewhere in this paper M* denotes the Hermitean transpose of a 
matrix M. It is important to note that the two above reahzations may be 
chosen to share the A and C matrix. 

Given these state space reahzations for Z{s) and K{s), we may construct 
two alternative state space realizations for 



F 


G ■ 


H 






A 





M ' 







-A* 


C* 


= TT 


c 


-M* 








F 


G ' 


H 


_ 



(4) 



" F 


G ' 


H 






A 


-BB* 










-A* 


c* 


, $ = TT 


C 











" F 


G ' 


H 






(5) 



using standard formulas for the state space realizations of the sum and 
product of two rational functions, see Appendix A. 2. 

The co-degree of a proper rational function G{s) is defined as the multi- 
plicity of the zero of G at infinity. Thus the co-degree of $ is 2n — 2q, see (2). 
Clearly the co-degree of ^ is twice the co-degree of its spectral factor K and 
thus is even. For a more detailed discussion on the co-degree and the zeros 
of a rational function see Appendix A.l. 

As shown in [10] the following proposition concerning the normalization 
constant and the moments of a rational density holds: 



Proposition 3.1 Let X be a real random variable with non-normalized ra- 
tional probability density function p with corresponding spectral summand 
Z, hence ^{ix) := p{x) = Z{ix) + Z*(—ix), and let {A,M,C) be a stable 
state-space realization of Z. Then CM is real and positive and 2nCM 
probability density function corresponding to X. The moments E(X') of X 
exist for I = 0, . . . ,k — 2, where k is the co-degree of^ and the moments are 
given by E{X^) = (-i)' / = 0, 1, 2, . . . , A; - 2. 

In [10] it was shown, i.a., that the operations of scaling and convolution 

of rational density functions can be translated into linear algebra operations 
on corresponding state-space realizations. For ease of reference these results 
are collected in the Appendix A.5. 
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3.2 The composition formula 



A key step in the present paper is the construction of a reaHzation of the 
rational density function given by pu{j^l^)/V{x), where y is a fixed non- 
zero real number and V{x) is a positive polynomial on the real line, if a 
realization of = puix) is known. Such a realization is constructed 

via the following proposition which gives a realization formula for the com- 
position G = Gi o g2, G{s) = Gi{g2{s)) of two proper rational complex 
functions Gi and g2- Here we will apply this result to Gi{ix) = pu{yx)x 
and g2{ix) = i/V{x). This implies that we could allow V{-) to be a rational 
positive function such that 1/V{x) is strictly proper. 

The only constraint on the pair Gi , 52 will be that the direct feedthrough 
d2 = lim^^oo 52(5) is not a pole location of Gi, i.e. Gi{d2) 7^ 00, because 
otherwise the composition Gi o g2 would have a pole at infinity, or in other 
words it would not be proper rational function and therefore would not 
have a state space representation of the form that we use here. In fact 
in the proposition we will allow Gi even to be a rational matrix function, 
corresponding to a multi-input, multi-output system in the system theoretic 
interpretation. 

Proposition 3.2 Let Gi,g2 be proper rational functions with state space 
realizations {Ai G C"i^"i,Bi G C"i^™i,Ci G CPi^"i,Di G C^'i^^i) and 
{A2 G C^^x'^a^t^ G C"2xi,c2 G Ci^"2,d2 e C) respectively, so Gi{s) = 
Di + Ci{sl — Ai)~^Bi, g2{s) = d2 + C2{sl — A2)~^b2- Assume that d2 is not 
an eigenvalue of Ai. Then the composition G = Gi o g2 is again a proper 
rational (matrix) function with state space realization {A, B,C, D) given by 
the formulas 

A = 0A2 + (Ai -d2/„J-^®62C2 GC'^i'^^xnina 

B = -(Ai-ci2/nJ-'SiC5 62 GC^^^^X"*! 

C = Ci(Ai-d2/„i)-l®C2 GCfiX'^i"^ W 
D = L>i-Ci(Ai-cZ2/ni)-'Bi GCfiX"*! 

Here (g) denotes the Kronecker product, see e.g. [12], ch. 12. 
Proof: Use will be made by the following inversion formula for rational 
matrices that is well known in system theory. Let (A,B,C,D), with D 
invertible, denote the state space realization of a proper rational function 
G{s) = b + C{sl - Ay^B. Its inverse is given by (G(s))-i = D'^ - 
D-^C{sI -A + Bb-^C)-^BD-^. 

We need to show that the rational matrix G{s) with state space real- 
ization given by (6) is equal to Gi{g2{s)). In order to do that we calculate 
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G{s) as follows: 

G{s) = D + C{sl - A)-^B 

= D^-CMi-d2ln,)-^Bi 

- {Ci{Ai - dain,)-' ® C2) H-^ ((^1 - d2ln,)-^Bi ® 62) 

= Di-Ci{Ai-d2ln,)-^Bi 

-CMl - d2ln,)-\ln, ® C2)H-\ln, ® 62)(^1 - d2ln,)-^ Bi 
= D^ + Ci[-{A^-d2ln,)-^ 

-{Ai - d2lm)~\lm ® C2)H-\ln, ® 62)(^i - d2lm)~^] B^. 

where 

H = Slnin2 - Ini A2 - {Ai - d2lni)~^ ® b2C2 

= Sini ® /na " ^ni A2 - (Ini <^ &2)(^1 - (^2/ni)"H-fn2 1^ C2). 

This expression has the form Di + Ci{G{s))~^ Bi, where G{s) = D + C{sl — 
A)~^B and 

D = -{Ai - d2lm), C = {In, C2), A = /„, A2, 5 = 62). 
It follows that 

G{s) = -{Ai - d2lni) + {Im C2) {Sln,n2 - Ini (S) A2)~^ {In, (S) b2) 
= Ini (S) {d2 + C2{sln2 - A2)"^62) " ^1 
= /ni (^2 + C2(s/„2 - A2)-^b2) - ^1- 

and thus 

G{S) = Di + Ci {In, {d2 + C2{sln2 " A2)~'62) " ^l)'' ^1 = ^1(32(5)). 

□ 

Remark. A special case of the composition formula can be found in 
the theory of phase-type distributions in statistics. See e.g. [17] and [18], 
equations (2.5.1), (2.5.2). 

3.3 Transformations between the various representations of 
rational density functions 

In order to implement the filter we need various different representations for 
the conditional densities, i.e. the integrable spectral density <I>, the spectral 
summand Z, Z + Z* = and the spectral factor K, KK* = Thus 
we need procedures which compute such a representation from any of the 
others. The computation of $ from Z ot K follows from the formulas in 
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Appendix A. 2. The computation of a spectral summand Z from given K 
or $ is dealt with in the appendices A.3 and A. 4 respectively. The most 
demanding task is the computation of a spectral factor given a spectral 
summand and this will be presented in the following subsection. It should 
be noted that this spectral factorization problem is a standard problem in 
systems theory. However most of the literature deals with the case where 
lim^-joo ^(s) = > holds, i.e where there is no zero at infinity. Thus we 
found we had to develop a numerically robust procedure for the case where 
$(s) has a zero at infinity. 

Let a spectral summand = (7(sl„ — A)^^M be given. Now the task 
is to compute B such that Ki^s) = C{sln — A)^'^B is a spectral factor, i.e. 
such that ^>(s) = Z{s) + Z*{s) = K{s)K*{s) holds. The basic tool for this 
conversion is the so called positive real lemma: 

Lemma 3.1 A stable rational function Z{s) = C{sln — A)^^M is positive 
real, i.e. Z(ix) + Z*(ix) > for all x E if and only if there exists a 
solution P of the linear matrix inequality (LMI) 



L{P) 



-AP - PA* M - PC* 
M* -CP 



>0 (7) 



IfP is a solution, then^{s) = Z{s) + Z*{s) = K{s)K*{s), where K{s) = 
C{sln — A)~^B and B G C"^*" is determined from 

(8) 



" B ' 




" B ' 











L{P) = 

For a proof of this lemma see e.g. [5] . In addition we remark: 

(i) The rank of L(P) determines the column dimension of the function 
K{s). In particular it can be shown that there always exist square 
factors K. Since we here deal exclusively with the scalar case, we 
are only interested in rank one solutions, i.e. in solutions P where 
rankL(P) = 1. 

(ii) By the asymptotic stability of A it follows that any solution P of the 
LMI is positive semidefinite. 

(iii) The solution set V = {P \ L{P) > 0} is convex and bounded. If the set 
V is non-empty it contains a minimal and a maximal element, P and 
P say, i.e. P < P < P holds for all P E V. The minimum element 
corresponds to a minimum phase factor, say, i.e. all zeros of K{s) 
are in the closed left half plane: K_{s) = ^ 5?(s) < 0. Analogously 
P gives a maximum phase factor K, i.e K{s) = ^ 3fi(s) > 0. 
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By the positive real lemma it follows that the computation of the spectral 
factor is equivalent to the solution of the above LMI. A solution of this LMI 
will be constructed via the computation of what is known as a deflating 
subspace of the (2n + 1) x (2n + 1) dimensional pencil: 



XE-N := 



Xhn - F 


G ■ 


-H 






XIn- A 

AI„, + A* 



-C 



M* 



M 
C* 







(9) 



Note that the eigenvalues of this pencil are the zeros of ^{s). See Ap- 
pendix A.l for background material on pencils of the above form! 

Suppose for the moment that we have given a (rank one) solution P = P* 
of the LMI and the corresponding factor K{s) = C{sl„ — A)~^B. Further- 
more let /c = 2c be the co-degree of $ and thus c is the co-degree of K. By 
sonic easy algebra it follows that 



XIn- A 


-M ' 




" P 


' 




" p 


B 


XIn + A* 


-C* 




In 







In 





-C M* 










1 











XIn + A* 


-C* ' 


B* 






This implies 
(i) 



P 

In 

1 



is a basis for a deflating subspace of the pencil {XE — N). 



XIn + A* 


-C* ' 


B* 






(ii) 



is the pencil corresponding to the zeros of K*{s) and thus has a {c+ 1) 
dimensional infinite elementary divisor and an (n — c) dimensional 
finite divisor corresponding to the finite zeros of K*{s). 

In order to construct a rank one solution of the LMI (7) we therefore have 
to compute an (n -|- l)-dimensional divisor of the pencil (9) which itself has 
a (c+1) dimensional infinite elementary divisor and an (n — c) dimensional 
finite divisor. Let Z G £^i2n+i)x{n+i) -^^ ^ basis for the corresponding 
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deflating subspace, where in addition it is assumed that the first c+1 columns 

form a basis for the (c+1) dimensional deflating subspace corresponding 
to the (c+ 1) dimensional infinite elementary divisor. By the discussions in 
Appendix A.l it follows that Z may be partitioned as 



Z\2 Zi^ 
Z22 Z23 
Z31 Z33 

Note that Z can be written as 



^12, ^22 G C^^^ Zi3,Z23 GC"^^"- 



(10) 



Z = 



P 

In 

1 



where T is a (n + 1) x (n + 1) non-singular matrix. Hence the solution P is 
obtained from 



P — [Z12, Zis] [Z22, Z; 



23J 



fll) 



The only remaining choice is the choice of the finite eigenvalues, which de- 
termine the zeros of the factor K* . E.g. in order to get the minimal solution 
P (the minimum phase factor K) one has to choose the {n — c) anti stable 
eigenvalues 3fi(Ai) > 0. On the other hand choosing the stable eigenvalues 
3?(Aj) < gives the maximum element P and the maximum phase factor K. 

This procedure works provided that there are no zeros on the imaginary 
axis (except for the zero at infinity). Therefore for our implementation of 
the filter in addition we assume that pu{x) and pxii^) arc strictly positive, 
which implies that all conditional densities in the filter will be strictly pos- 
itive. However the numerical implementation still may run into trouble if 
there are zeros "close" to the imaginary axis! 

The actual procedure is now as follows: Start with the pencil (9) and 
bring it to the staircase form (20,21). Apply a QZ transformation to the 
lower right ((2n — 2c) x (2n — 2c)) dimensional block to bring the whole 
pencil into a QZ form, see e.g. [6]. So QEZ and QNZ are upper triangular 
matrices and Q and Z are both unitary. Next by a sequence of 2 x 2 
orthogonal transformations the diagonal elements corresponding to the n — c 
anti-stable (stable) eigenvalues are shifted to positions c + 2, c+ 3, . . . ,n+l, 
without losing the triangular structure. The desired basis for the deflating 
subspace then is given by the first n -\- 1 columns of the final Z matrix. 
Finally compute P as described in (10, 11) and B from (7), (8) in Lemma 
A.l. 
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3.4 Description of the filter in terms of state space formulas 

We can now describe how the filter could be calculated using state space 
formulas. Recall that for each rational probability density we associate three 
rational functions, namely the spectral density the spectral summand Z 
and the spectral factor K. Above it has been discussed how one can obtain 
the state space realization of K given the realization of Z. In Appendix A.3 
it is shown how to compute a state space realization of Z given a state space 
realization of K and in A. 4 a realization of Z is computed from a realization 
of A state space realization of $ given a state space realization for Z 
or K follows from the formulas in Appendix A. 2. Therefore we can switch 
between these state space realizations as needed. 

To start consider the probability density Pxt\y^^^ °^ t = \ the density 
pXi- Calculate its spectral factor, K\ say. Consider the spectral density 
of \J and construct a state space realization for G(s) = ^ij{^—iYts){—is\ 
where Yt is the observed output variable. Construct a state space real- 
ization of the spectral density function g(^s) = is). Then use the 
composition formula to obtain a realization of the spectral density G o g of 
pu{Yt/V{x))/V{x). Calculate the realization of the spectral factor, K2 say, 
of this density. Construct the product of Ki and K2 (see Appendix A. 2), 
this gives the realization of the spectral factor of ctPxt\Y* ■ Calculate the cor- 
responding spectral summand. Compute the normalization factor a from 
Proposition 3.1 and compute the realization of the spectral summand of 
Pxt\Y^- This will be the input for the prediction step. 

Calculate the realization of the spectral summand oipg^Xt\Yl using the 
formula for the scaling, see Proposition A.l, part (i) in Appendix A. 5. Form 
the realization of the spectral summand of pw- Construct the realization 
of the spectral summand of Pxt+i\Y-^ — PaXt\Yl * Pw using the convolution 
formula given in Proposition A.l, part (ii), in Appendix A. 5. 

Now, as soon as a new observation Ij+i becomes available one can pro- 
ceed to a new update step. 

The behaviour of the co-degree and the state-space dimension and McMil- 
lan degree of the conditional densities in the filter can now be described. 
First consider the co-degree. Let k-^Q denote the co-degree of px^ and k^g 
the co-degree of Pxt\Y{- We know that the co-degree of a rational proba- 
bility density is even. Let d denote the degree of the polynomial V, then 
the rational function g constructed above has co-degree d. Because by as- 
sumption the probability density of U is (strictly) positive, it follows that 
the co-degree of Go ^ is d. Hence the co-degree of px^\Y^ is = + d. 
As non-zero scaling does not affect the co-degree and convolution of two 
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rational densities leads to a rational density with co-degree equal to the 
minimum of the two co-degrees of the arguments of the convolution (see the 
notes after Proposition A.l in Appendix A. 5) we find = min(fc(|^, k^), 

where kw denotes the co-degree of the rational density of W. This makes 
that the co-degrees of the conditional densities Pxt+i\Y-^j t > 1 are bounded 
by kw- This will turn out to be important in the next section. 

Now let us turn to an analysis of the state-space dimensions and the 
McMillan degree. For the composition and convolution of two rational func- 
tions the state dimension of the output is the product of the respective 
dimensions of the inputs (see Propositions 3.2, A.l) whereas for the product 
the dimensions add up, sec Appendix A. 2. Therefore in each step the state 
dimension of the realization of the conditional densities tends to increase 
dramatically. (Note that together with the result on co-degrees this sug- 
gests that the resulting conditional probability density functions will have a 
non-trivial numerator, even if one uses Studcnt-t or Cauchy densities for the 
disturbances). Theoretically it is possible that the resulting state-space real- 
ization is not-minimal, in which case the McMillan degree would be smaller 
than the state-space dimension and a state-space reduction procedure could 
be applied. In practice we do not expect this to happen very often. However 
if this is the case approximately, one can profitably apply model reduction 
techniques, to keep the state-space dimensions manageable. We suggest to 
apply model reduction at each time step to approximate the high degree 
rational density by a lower degree rational density. This will be the 

topic of the next section. 

4 Balancing and balanced model reduction 

There are many possibilities for model reduction. The challenge here is 
that the approximant has to be nonnegative on the real axis. In terms of 
the corresponding spectral density this means that the approximant spectral 
density has to be nonnegative on the imaginary axis. In terms of the spectral 
summand this means that the approximant has to be positive real. 

One well known method to achieve this is the so called "positive real 
balanced truncation" technique, see e.g. [3], which we will shortly explain 
here. 

Note that the solution set V contains two particular elements, namely the 
minimal and the maximal element, P < P say, and it has been discussed 
in section 3.3 how to compute these elements. A state space realization 
[A, M, C] of a spectral summand Z{s) = C{sln — A)~^M is called positive 
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real balanced iff 



P = P = S = diag(o-i, ...,an), > (72 > • • • > fT„ 



holds. The cj.j's are called the positive real singular values of Z. Since P<P 
holds and since the squared singular values are the eigenvalues of PP it 



follows that these singular values are bounded by < cTj < 1. Furthermore 
it is known (see [9], Theorem 4.1) that cri = ■ ■ ■ = (Tc = 1 and 1 > aj for all 

j = c + 1, . . . ,n, holds, where k = 2c is the co-degree ^ = Z + Z*. 

It is easy to see that a state space transformation 
[A, M, C] [TAT-^,TM, CT'^], where T G C"^" is a non singular matrix, 
transforms P and P~^ as P ^ j-pj^* ^nd P~^ ^-*p-ij,-i therefore 
such a balanced realization may be obtained by the following procedure. 
Suppose P and P are given and let P = pV2p*/2 p ^ pV^p*/^ 
be some arbitrary factorization of these positive definite matrices. Here 
M^/^ denotes a square root of a positive definite matrix M > 0, i.e. M = 
mV2(mV2)*. In addition we use the notations M*/^ = (mV2)*^ M-V2 = 
(MV2)-i and M-*/2 = (M*/2)-i. Next let p*/2p-*/2 ^ jjj.y* ^-^^ jj^ y 
unitary matrices, be a singular value decomposition (SVD). The state space 
transformation 



T = i;~V2y*p- 



-1/2 



j.l/2jj*p-l/2 



then gives the desired balanced realization, since 

rp-*p-'^'p-l ^ j^l/2y*p*/2-p-l-pl/2^^i/2 ^ ^ 

Let [A, M, C] denote the balanced realization obtained by this procedure 
and let these matrices by partitioned as 



T 
1 



A 


M ' 


C 






p-l 





^1 


Al2 


Ml ' 




A22 


M2 


Ci 


C2 






The (positive real) balanced truncated model Z is then defined as Z{s) = 
Ci{slm — Aii)'~^Mi, where m is the order of the reduced order system Z, 
i.e. An G C"*^'", Mi G C"*^^ and Ci G C^^'". 

It is important to note that 



M* -CT, 



> 
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and the diagonal structure of S implies that 

> 



-AnSn - Sn^Ji Mi - SiiCJ 
Ml* - CiSii 



This ensures that the reduced order model Z is positive real, see Lemma 3.1! 

The order m of the reduced order model may be chosen such that the 
approximation error docs not exceed an a priori given bound. In [8], equa- 
tion (4.30), the following relative error bound for the spectral densities is 
given: 

\mx) - ^ix)\/^(ix) < ( f] + ""fc)^ I - 1 for all X G M (12) 

\k=m+l / 

where $ = Z + Z* and % = Z + Z*. Let = 2c be the co-degree of <&(s). By 
the discussion above it follows that this bound is finite if and only if m > c 
holds. Furthermore note that for m > c the reduced order spectrum $ also 
has co-degree 2c, see [9], Theorem 6.1. 

From (12) it is easy to derive an error bound for the corresponding prob- 
ability density functions. For simplicity assume that $ is normalized, i.e. 
J ^{ix)dx = 1 and thus p(x) = ^{ix) is apdf. Let p{x) = ^{ix)/{f '^{ix)dx) 
denote the approximation of p{x) and let < r < 1 denote the error bound 
on the right hand side of (12). From (12) we obtain $(ia;)(l — r)) < ^{ix) < 
^{ix){l + r)) and thus 

/oo 
^{ix)dx<{l + T). (13) 
-oo 

Therefore it follows that 

1 + T 2r 

\p{x) - p{x)\/p{x) < 1 = for all x eR. (14) 

1 — T 1 — r 

It should be noted that in our experiments we observe that (13) is only a 
rough upper bound for the "integrated" approximation error and thus (14) 
is a conservative upper error bound. 

In our implementation of the filter a model reduction step is included 
after each prediction step. This means after we have computed a realization 
of the spectral summand of Pxt+i\Y^^ '^Pply the above described scheme 
to get a realization of an approximant. This will be used instead oi px^_^_^\Y*- 
The order m of the reduced order model is chosen such that the above error 
bound (14) does not exceed a given threshold 1 > r > 0. Note that the co- 
degree of px^_^_j^\Y-l- is bounded by the co-degree oipwt and that the reduction 
step does not alter the co-degree! 
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5 Autocovariance function and estimation 



In this section we analyze the properties of the processes (Yf) and In 
particular it will be shown, given some suitable assumptions, that (Yj) is a 
white noise process and that (|lt|) is an ARM A process. The mean and the 
auto covariance function of may be easily computed from the model 

parameters in particular from the coefficients of the polynomial V{x) and 
from the moments of the noise processes (Wt) and ([/*). This enables the 
use a simple method of moments to estimate the model parameters. 
The standing assumptions in this section are as follows: 

(i) V{x) = Vo + Vix + ■ ■ ■ v^x^ is a non negative polynomial {V{x) > for 
all a; G M) and it has order d. 

(ii) The processes (Wt) and (Ut) are two i.i.d processes, which are inde- 
pendent from each other. The moments Mw{k) := ¥,W^ exist for all 
< A; < mw and mw > 2d holds. The moments Mu{k) := EU^ exist 
for all < < mu and mu > 2 holds. 

(iii) The parameter a is bounded by \a\ < 1. 

Note that within this section it is not needed that Wt and Ut have rational 
probability density functions. 

The main result is given in the following Proposition: 

Proposition 5.1 Under the assumptions (i), (ii) and (iii) there exists a 
strictly stationary solution {XtiYt) of the model (1). 

The moments Mx{k) := EXf exist up to order mx = mw and may be 
computed recursively from the relations (starting with Mx{0) = 1) 

Mx{k) = ^— '^'^w{k - l)Mx{l) ; l<k<mw (15) 

The process (Yt) is a white noise process. 

The process Zt = \Yt\ is an ARMA process of order less than or equal to 
d+l. 

Proof: Let Mt^k = X]j=i Since Mt^k is monotonically increas- 

ing with k, and since EMt^k < IE|VFt|/(l — \a\) is bounded, we conclude 

that limfc^oo ^'h,k and Xt := limfc^oo Sj=i ci^^^Wt-k exist a.s. Furthermore 
EXt = EWt/(l - a). 
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Now suppose that EX^' exists for 1 < Z < A; < mw- From (1) it follows 
that 

fc-i 



Since Ylt=o if) (^^^t^t ' ^ ^^ite mean it follows analogously that EX^ 
exists. By taking expectations on both sides of the above equation and by 

using the independence of Xf and Wt one obtains (15). 

Furthermore for k > 0, X,Vfe+i^' = {aXt+k+Wt+kYXi = EU 
and thus E(X,Vfe+i^t ) = Qa'Mw{t - imxl^,x{). 

Define Xt = (1, Xt, . . . Xf)' , Mx = EXt = {1, Mx{l), ■ ■ ■ , Mxid))' , 
V = {vo, . . . VdY and 

/ Qa^MwiO) \ 

Qa'^Mwil) {l)aMw{0) ••• i 

{l)a^Mw{2) {l)aMw{l) Qa^MwiO) ••• i 

: ■•. 

\Qa^Mw{d) ■■■ {'^)a''Mw{0) J 

Using these notations the above relations may be written as: Mx = 
FMx and EXt+k+i^i = FEXt+kXl- 

First consider the process (V{Xt)). It is immediate to sec that E,V{Xt) = 
V'Mx and EV {Xt+kW (Xt) = V'F''{EXtX't)V, ^ot k > 0. Note that F 
has eigenvalues 1, a, . . . , a*^ and that e = (1, 0, . . . , 0) and Mx are the left 
and the right eigenvectors corresponding to the eigenvalue 1. Furthermore 
eEXtX[ = M'-^. This implies that the auto-covariance function of V{Xt) is 
given by 

f V' {EXtX't - Mx M'x )V for /c = 

Coy{V{X,+k), V{Xt)) = I _ j^^^ _ ^^^^E^^^,^^ fo^^ > 

From the above representation it follows that {V{Xt)) is an ARMA process 
of order less than or equal to d + 1. Note that {F — Mxe) has eigenvalues 
0, a, . . . , a^. 

Next consider the process (Yt). We have EYt = EV{Xt)EUt = 0, by the 
independence of Xt and of Ut- The auto covariance function of (Yt) is given 

by 

EV{XtfEU^ foTk = 



EYt+kYt = E{V{Xt+k)ViXt))E{Ut+kUt) 



for/e>0 
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Finally let us consider \Yt\. The mean value of \Yt\ is E\Yt\ = KV{Xt)E\Ut\ 
and the second moments are given by = ¥,V{Xt)'^KU^ and E|l(+jt||lt| = 

E{V{Xt+k)V{Xt))mUt\)^ for k>0. This implies 

rovnr \ \Y\)-S - iEViXt)E\Ut\f for A; = 

□ 

Of course analogous calculations apply for \Yt\^, provided, that suffi- 
ciently many moments of Wt and of Ut exist. 



6 Simulation results 

All simulation and estimation results presented here are based on the fol- 
lowing specifications: 



Xt+i = aXt + Wt 

Yt = ^V{aXt)Ut ^'^^ 

This is a slight reformulation of the model (1). The idea is to fix the func- 
tion V{x) and the distributions of Wt and of Ut, which leaves the three 
parameters a, * and a for estimation. 
The function V{x) is chosen as 

V{x) = {l + ^r + 0.1 

which is a rough approximation of exp(a;/2). The additional constant 0.1 is 
added to ensure V{x) > 0. 

The inputs Wt and Ut are assumed to have scaled t-distributions. This 
means that cwWt has a t-distribution with nw degrees of freedom and the 
scaling constant chosen such that ¥,W^ = 1 holds. Analogously CfjUt has 
a t-distribution with nu degrees of freedom and the scaling is such that 
EUt = 1 holds. Throughout this section the integer parameters d,nw and 
nu are fixed and given by d = 4, nw = 9 and nu = 3. This implies in 
particular that the assumptions of Proposition 5.1 are fulfilled. 

The first part of this section deals with the estimation of the param- 
eters (a, ^',0"). Table 1 shows the moments of the process {\Yt\) for some 
combinations of the parameters a, * = 1.0, a. 
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In a small simulation study we have investigated the performance of a 
simple method-of-moments estimation, where 10 lags of the auto-covariance 
function of have been used. To be more precise let 

m(a,*,a) := (E|yt|, Var|yt|, Cov(|yt+i|, 1^*1), . . . , Cov(|yt+io|, |lt|)) 

and let niT be the sample estimate of this vector of moments given a sample 
of size T. Then the estimates (a, ^, a) are computed by minimizing 

||m(a, a) — ititW'^ 

The results for 1000 simulation runs for (simulated) data series of length 
T = 1000 are collected in table 2. Both the mean estimation error (mean) 
and the standard deviation (std) over these 1000 simulation are shown in 
dependence of the true parameters. Note e.g. that the estimate of a shows 
a significant bias especially for small a and a. However this is only a first 
rough estimation scheme and other enhanced estimates will be investigated 
in future. 

It has been mentioned in section 2 that the filter is able to compute 
the likelihood. However the computation of the filter is presently too time 
demanding to implement a maximum likelihood estimation based on the 
filter. 

Next we test the filter on some real world data. In particular we con- 
sider data which also have been analysed by [14] . The authors consider five 
exchange rate data series and study the empirical performance of stochastic 
volatility models. Here we only consider the Dollar/Yen exchange rate data, 
which consists of T = 1102 weekly observations from 3 January 1973 until 
9 February 1994. 

The parameters of the model (16) have been estimated by the method 
of moments as described in the previous section. Here 25 lags of the auto 
covariance are used and the obtained estimates are a = 0.957, a = 0.309 and 
^ = 0.921. Figure 1 shows the sample ACF and the fitted ACF. Next the 
filter is run on this data set to compute a one step ahead prediction of 
The result is shown in figure 2. Note that, since the stationary solution of 
the state Xt is not rationally distributed as far as we know, we have simply 
assumed that Xi has a scaled t-distribution with nx = 9 degrees of freedom 
and the scaling was chosen such that the variance of Xi is equal to 1/(1 — a^) 
i.e. equal to variance of the stationary solution. 

Finally we consider some simulated data. The parameters were chosen 
as a = 0.9, ^ = 2 and a = 1.5. The simulation and the filter were initialized 
with a scaled t-distributed random variable Xi , where the degrees of freedom 
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is nx = 9 and the scaHng parameter is chosen such that the variance of Xi 
is equal to 1/(1 — a^). The length of the simulated series is T = 100. 

Figure 3 shows the simulated trajectory of Yt and the one step ahead pre- 
diction of \Yt\, i.e. E{\Yt\\Yl-^) = ^(E|;7t|)E(y(c7Xj) | F/"^). The condi- 
tional expectation 'E{V{aXt) \ Y^~^) is computed from the conditional prob- 
ability density function Pxt|y*~^' which is computed by the filter. See also 
Propositions 3.1 and 5.1. 

Figure 4 shows the conditional probability density function PXf^i\Y*j 
t = 100. In each time step of the filter balanced model reduction is used as 
described above. Let Pxt+i\Y* denote the approximation of the conditional 
pdf Pxt+i\Yl- Tli^ order m of the the reduced order system is chosen such 
that the relative error \Pxt+i\Y^i^) ~ PXt+i\Y*\/PXt+i\Y^{^) most 0.02, 
i.e. we allow at most an error of 2 percent. See equation (14). For this 
specific example typical model orders are n = 85 and m = 9, which means 
that the state dimension is almost reduced by a factor 10. If one compares 
the conditional expectation of Xt and of V{Xt) given the observations Y"/ 
computed from the full order pdfpx^_^j^\Y^ ^^^d from the approximant Pxt+i\Y* 
then in this example the relative error is of the order 10~^^. These numbers 
indicate the excellent quality of the used approximation scheme. 

Finally figure 5 shows the evolution of the conditional densities Pxt+i|y/ 
over time. 

7 Conclusion 

The exact filter for a class of stochastic volatility models is derived. A stan- 
dard stochastic volatility model in which the disturbances are Gaussian and 
the volatility function involved is exponential can be viewed as a limiting 
case. The complexity of the exact filter increases in the sense that the matri- 
ces that are used to represent the rational probability density functions tend 
to grow quickly. An approximate filter is presented in which at each time 
step the conditional probability density function of the state, which is ratio- 
nal, is replaced by an approximating rational probability density function, 
using the SBT method (stochastically balanced truncation). Using a well- 
known error bound the approximating rational probability density function 
can be chosen such that on each point of the real line the relative error is 
less than a given percentage (the tolerance level involved can be chosen by 
the user). In some simulated and empirical applications we find that using a 
tolerance level of as low as 2 percent still leads to an enormous reduction in 
complexity, keeping the order of the rational functions well within bounds 
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that are considered tractable with modern computers. Lower tolerance lev- 
els could also be achieved if desired, but then larger matrices will have to be 
handled. The model presented is very flexible, especially with respect to the 
specification of the probability density functions for the disturbances. Here 
one can vary between very heavy-tailed disturbances (with Cauchy density 
for instance) and less heavy-tailed disturbances (with Student-t densities 
that are approximating Gaussian densities for example) . In the applications 
in this paper we have stayed as close as possible to the traditional Gaussian 
model. However the possibility of specifying more heavy-tailed densities 
seems one of the most interesting features of this class of models. Exploring 
those possibilities is an interesting topic for future research. Also valuation 
of financial derivatives in a market in which the asset price movements can 
be described by a stochastic volatility model of the type investigated here, 
is an interesting topic for future research. More generally the methodol- 
ogy of working with rational density functions in filtering problems in the 
way presented here could have a much wider range of applications, as the 
methodology is really general and flexible and numerically stable methods 
for various operations involved are now provided. Preliminary experience 
with the methodology shows especially striking results deriving from the 
application of the SET approximation method. It is to be expected that 
this can also be successfully applied to the linear filtering problems with 
rationally distributed disturbances considered in [10]. 



A Results from system theory 

A.l Numerical calculation of the co-degree and of the zeros 
of a strictly proper rational function 

Consider a strictly proper scalar rational function^ 
G{s) = C{sl^ - A)-'B = ^o + ais + ... + a,s^ 

where 

q < n and a{s) = ao+ais+- ■ --l-a^s^, b{s) = bo+bis+- ■ ■+bns"' are coprime. 

The co-degree of G is defined as (n — g), i.e. as the multiplicity of the 
infinite zero of G(s). Since G is strictly proper the co-degree is positive. 



^In this section G is an arbitrary, not necessarily stable, transfer function. We will use 
results obtained in this section e.g. for a spectrum $ = KK* and for its factor K*. 
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The Taylor series expansion of G{s) at infinity is given by 

G{s) = Go + Gis"^ + + . . . 

= D + CBs-^ + CABs-^ + ■■■ 



Therefore the co-degree of G is related to the Markov parameters of G as 
follows. 

Lemma A.l The co-degree of G{s) = C{sln — A)^^B is equal to c iff 
• GA^-^B / and GA'-^B = for alll<i< c. 

Note that a naive check on GA^~^B = in order to compute the co- 
degree is numerically unstable since A might have eigenvalues of modulus 
larger than one and thus round off errors would "explode" . 

The (finite) zeros of the transfer function G{s) are the (finite) eigenvalues 
of the pencil: 



Therefore the co-degree and the finite zeros of G may be computed from 
the eigenstructure of the above pencil. We will make use of the following 
concepts, see e.g. [21]. A pencil {XE — N) is called regular if it is square and 
if det(A-E' — iV) is not constant. The zeros of det{XE — N) are the eigenvalues 
of the pencil. Suppose there exist full column rank matrices X,Y e C"^'^, 
k < n and matrices E,N e g^^j^ ^j^g^^ 



holds. The space spanned by the columns of X is called a deflating subspace 
of the pencil {XE — N). This is a generalization of the concept of invariant 
subspaces to arbitrary pencils. The /c-dimensional pencil {XE — N) is called 
a divisor of {XE — N). If ^ is non singular, then {XE — N) is called a finite 
divisor of {XE — N). In this case {XE — N) has k finite eigenvalues which 
are of course also eigenvalues of {XE — N). If there exist two non singular 
matrices S,T e<C}^^ such that 



XE-N : 



XIn-A B 
-G D 



(17) 



{XE - N)X = Y{XE - N) 



(18) 



/ -1 A 



\ 



-1 A 



S{XE - N)T = 











-1 A 



V 



-IJ 
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then {XE — N) is called an elementary infinite divisor. 

An alternative characterisation of the co-degree now is as follows: 



Lemma A. 2 The co-degree of G{s) is positive and it is equal to c iff the 
pencil (17) has an elementary infinite divisor of dimension (c + 1) and a 
finite divisor of dimension {n — c). 

Proof: To prove this lemma the pencil is transformed to a socalled staircase 
form as defined in [21]. This will also give a numerically robust way to 

analyze the co-degree and the cigenstructure of the above pencil. 

Let Ui G C"^" be a row compression of (— -B), i.e. Ui is a unitary matrix 
such that U^i-B) = [6, 0, . . . , 0]* and b>0. (Note that B + 0.) Apply this 
state space transformation and define 



\Jl 



A 


-B ' 




" J7i ■ 






-Bi 


C 


-0 




1 




[ Ci 










Note that CB = C^Bi and thus the first element of C\ is zero iff c > 1. 
In the next step let 



U2 



1 
U2 





-Bi 1 




" C/2 ■ 




r A2 


-B2 


Ci 







1 




[ C2 






where U2 G C" ^ is a row compression of the last n — 1 entries of the 
first column of Ai. Apply this state space transformation to get 

U*2 
1 

By construction the (1,1) clement of Ai and the first elements of Ci and 
of B\ are not affected by this transformation. Furthermore note that the 
last n — 1 elements of B2 and the last n — 2 elements of the first row of A2 
are zero. In addition we have CAB = 02^2-62 = iff c > 2. Thus c > 2 
holds iff the second clement of C2 is zero. 

Now this procedure is repeated until a nonzero element pops up in the 
/c-th position of C^. This is a possible way to estimate the co-degree of G. 
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After c + 1 steps of this kind we end up with a matrix of 



the form: 



* * 






* • • • 


... * 


© 








^ ... 


... ^ 


n 


■•• ■ 












••• • 


• © 


* 


* • • • 


* 





••• • 


. ... 




* ... 


* 





••• • 


. ... 





* ... 


* 




••■ ■ 


. ... 





* ■ ■ • 


* 





••• • 


. ... 


a 


* ... 


* 






(19) 



The horizontal and vertical lines partition the above matrix into blocks 
of size c, n — c and 1 respectively. Two particular elements of the above 
matrix, namely the (c+ 1, 1) and the (n+ 1, c) element, are denoted with /3 
and a respectively. Note that /3 > and a 7^ holds. 

Note that (for j < c) the columns + 1] of the matrix U = 

U1U2 Uc+i form an orthogonal basis of the column space of [B, AB, . . . , A^B]. 

By a permutation of rows and columns we bring the last column to the 
first position and the last row to the (c + l)-th position. Finally apply the 
Givens rotation 



Q = 



a 



(5* 
a 



1 



m 
q2i 



qi2 

q22 



to the rows c + 1 and c + 2. If Q and Z denote the concatenation of all these 
unitary row and column operations, then we have 



Q 



A 


-B ' 
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" © * • 




* • • 


* 




■•• ■ 
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••• • 


• ••• 
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* 
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(20) 



and 
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• 1 



Ell Ei2 
E22 



(21) 

Now this block upper triangular form displays the eigenstructure of the 
pencil {XE — N). Since Nn £ (j^c+ixc+i jg upper-triangular non-singular 
matrix it follows that {XEn — Nu) is an (c -|- 1) dimensional elementary 
infinite divisor of the pencil. Furthermore note that Z may be partitioned 
as 

" U 
1 

and that the first c+1 columns of Z form a basis for the deflating subspace 
corresponding to this infinite divisor. The same holds true if wc only take 
the first J + 1 columns, for < j < c. To be more precise consider the 
{j + I X _7 -I- 1) dimensional left upper sub- block of {XEu — Nu). By the 
triangular structure of the matrices En and Nu it follows that this sub- 
block defines an infinite elementary divisor and that the first j + 1 columns 
of Z span the corresponding defiating subspace. 

Since E22 G C^'^"^) ^ ("-^) is non singular it follows that {XE22 - N22) is 
an (n — c) dimensional finite divisor of the pencil. □ 



A. 2 Elementary operations on rational functions 

Let two strictly proper rational function Gi = Ci{sl — Ai)~^Bi with state 
space realizations {Ai, Bi,Ci), i = 1,2 be given. 
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A state space realization for G*(s) = Bl{—sl — Al) ^C* is given by 
The sum Gi + G2 has a state space realization: 



-Al 


CI ■ 


-Bl 








" A, 





Bl 







A2 


B2 




L Ci 


C2 






The product G1G2 has a state space realization: 





■ ^1 


-B1-B2 





G1G2 = TT 




















If CiBi = then Gi{ys)s is strictly proper and a state space realization is 
given by 



Gi{ys)s = TT 



Aiy-' 


AiBiy-^ ' 


Biy-' 






A. 3 Computation of a spectral summand from a spectral 
factor 

Suppose we have given a (stable) spectral factor K{s) = C{sln — A)~^B 
and that wc want to compute a spectral summand of = K(s)K*(s): 
Let P be the solution of the Lyapunov equation 

AP + PA* + BB* = 

and define M = PC* . The state space transformation T 

In P 



T = 
then gives 

In P' 
/„ 



/„ 



A -BB* 
-A* 



(22) 



In -P 
In 



A -AP - PA* - BB* 
-A* 





' A 










-A* 



' In P' 









- Pfj* - 




' M ' 


.0 I„ 




c* 




c* 




C* 



[ C 



In -P 



[ C -CP ] = [ C -M* ] 



and thus Z(s) = C{sln — A) is a (stable) spectral summand of $(s). 
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A. 4 Computation of a spectral summand from an integrable 
spectral density 

Let $ = H{sl2n — F)~^G be given. First compute a Schur decomposition of 
F such that the stable eigenvalues of F appear on the first n positions, i.e. 



V*FV 



Fii Fi2 
F22 



where F is a unitary matrix, F is an upper triangular matrix and Fii G C"^" 
is asymptotically stable. 

Solve the Lyapunov equation 

-FnP + PF22 + Fi2 = 



and set 



A = Fn 
M = [I,P]V*G 

c = Hv{in,oy 

to get a stable spectral summand Z = C{sl — A)~^M. 



A. 5 Operations on rational densities 

In [10] it was show that the operations of translation, scaling, multiplication 
and convolution of rational densities can be translated into linear algebra 
operations on corresponding state-space realizations of spectral summands. 
For ease of reference here we give some of these results which are needed for 
the implementation of the filter. Note that multiplication of two rational 
functions could be implemented via their summands. However using spectral 
factors seems to be numerically more reliable. Thus in our implementation 
of the filter we have chosen this approach. 

Proposition A.l LetX\ andX2 denote stochastically independent random 

variables with rational density functions pi, p2- For j = 1,2, let Zj{s) denote 
the corresponding stable spectral summand, with a state-space realization 
[Aj, Mj,Cj] with state-space dimension nj. 

(i) For a 7^ the random variable X = aX\ has a rational density whose 
spectral summand has a state space realization given by [A^ M, C] = 
[aAi,Mi,Ci] ifa>0 and [A,M,C] = [-aAl,Cl,Ml] ifa<0. 
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(ii) The sum X = Xi + X2 has a rational density function p = pi * P2, 
i.e. the convolution of p\ andp2, and the spectral summand of p has a 
state-space realization given by [A, M, C] where A = Ai®In2+Ini^-^2, 
M = Ml® M2 andC = Ci® C2. 

We finish this subsection with a note on the co-degree of the convolution 
of two rational probability density functions. Note that for two independent 
random variables, Xi, X2 say, it holds that 

^Xl + XsT < 00 if and only if ¥.\XiY < 00 and El^sT < 00. 

see e.g. [16], Problem 4.6.11. Together with 3.1 this implies that the co- 
degree of the convolution of two rational densities is equal to the minimum 
of the co-degrees of these two densities. This fact is used in the text to track 
the co-degrees of the conditional probability density functions arising in the 
filter. 
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a = 0.5 (7 = 1 



nYt\ 


a 


= 0.5 


0, 


.7202 


0. 


.7809 




a 


= 0.9 


0. 


.7797 


1. 


.0279 


VaT{\Yt\) 
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= 0.5 


0. 
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a 
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1, 


,2994 


4, 
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COTTi\Yt+l\,\Yt\) 


a 


= 0.5 


0. 


,0209 


0. 


,0619 




a 


= 0.9 


0, 


,1133 


0, 


,2270 



Table 1: Moments of the process \Yt 



This table shows the moments of the absoulute values of the outputs for some 
parameter values a, * = 1, a. 
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a = 0.5 a = l 

a- a mean a = 0.5 -0.3154 0.0297 

a = 0.9 -0.0522 -0.0000 

std a = 0.5 0.2203 0.2321 

a = 0.9 0.2322 0.0591 

* - * mean a = 0.5 -0.0347 0.0074 

a = 0.9 -0.0350 -0.0644 

std a = 0.5 . 0642 0.0726 

a = 0.9 0.1010 0.4151 

a- a mean a = 0.5 -0.1343 -0.2232 

= 0.9 -0.0428 0.0142 

std a = 0.5 0.4956 0.4751 

a = 0.9 0.3886 0.4262 

Table 2: Simulation results for the MM estimator. 



This table shows the results for 1000 simulation runs for (simulated) data series of 
length T = 1000. Both the mean estimation error (moan) and the standard deviation 
(std) over these 1000 simulation are shown in dependence of the true parameters. 
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Figure 1: 
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Figure 1: Dollar /Yen exchange rate: sample ACF (green) and fitted ACF 
of the absolute values. 
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Figure 2: 
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Figure 2: Dollar /Yen exchange rate: absolute values of the exchange rates 
(black) and the corresponding one step ahead predictions as given by the 
filter (yellow). 



36 



Figure 3: 
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Figure 3: Simulated data: time series plot of the simulated trajectories of the 
noise process Ut (blue) and of the outputs Yt = V{Xt)Ut (green). The gray 
shaded area is bounded by iby(Xt)E|f7f| i.e. by the conditional expectation 
of the absolute values of Yt given the state Xt. The dashed black line shows 
the corresponding conditional expectation of lYfj given the past observations 
as computed by the filter, (i.e. the one step ahead forecasts of \Yt\). 
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Figure 4: 

conditional pdf(X |t), t=100, rel err <= 0.0051 
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Figure 4: Simulated data: This figure shows the conditional pdf Pxt\Y^ 
for t = 100. Note that also the approximant probability density function, 
PXt+i\Yl say, as computed by the positive real balanced truncation method 
is plotted. However on this scale the full order probability density function 
and the low order approximation can hardly be distinguished, since for the 
relative approximation error {pxtlY^i^) ~ Pxt+i\Y^i^)\/PXt\Y^i^) — 0-0051 
holds by (14). The state space dimension of the realization of the spec- 
tral summand of Pxt+i\Y^ is n = 85 and the spectral summand of Pxt+i\Yl 
has order m = 9. The co-degree of the corresponding spectral densities 
'^Xt+i\Yliix) = pj^^^^^yt{x) and ^Xt+i\Y*iix) = is 10. 

The vertical black line marks the true value Xt-\-i and the dashed black 
line marks the corresponding estimate, i.e. the conditional expectation 
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Figure 5: 
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Figure 5: Simulated data: This plot shows the evolution of the conditional 
densities Pxt+i\Y{- Each "column" shows the conditional Pxt+i\Y{ ^ given 
time t, where high values are coded with red and low values of this pdf are 
coded with blue. In addition the solid black line shows the trajectory of Xt 
and the blue line marks the corresponding one step ahead predictions, i.e. 
the mean of the conditional densities Pxt+i\Y^- 
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